CSDE News & Events

Flaxman to Present on New Python Package to Create Synthetic/Simulated Individual Data (5/17/23)

Posted: 5/14/2023 (CSDE Research)

On May 17 at 4:30 pm, CSDE Affiliate Abraham Flaxman will present to the eScience Institute on his team's work generating pseudo-populations, in a talk titled “Introducing pseudopeople: Census-scale simulated data for entity resolution.” The talk will introduce and demo pseudopeople, the team's new, publicly available Python package intended for use in entity resolution research and development. The package generates census-scale simulated population data with adjustable parameters to replicate key complexities from real challenges in record linkage work. Typical applications of entity resolution and record linkage rely on sensitive and confidential data, which can be a barrier to reproducible computational research and sometimes even to open communication about innovations and challenges. The value hypothesis of this work is that realistic simulated data (including non-confidential simulated versions of sensitive fields such as name, address, and date of birth) will enable more research in census-scale entity resolution and guide that research toward the challenges the Census Bureau faces in practice.


Deadline: 05/17/2023